Instruction Cache Designs for a Class of Statically Scheduled Instruction Level Parallel Architectures

Authors

  • Thomas M. Conte
  • Sanjeev Banerjia
  • Sergei Y. Larin
  • Kishore N. Menezes
  • Sumedh W. Sathaye
Abstract

Statically scheduled architectures such as very long instruction word (VLIW) architectures use very wide instruction words in conjunction with high bandwidth to the instruction cache to achieve multiple instruction issue. The encoding used for the instructions can have an effect on the requirements placed on the instruction fetch and instruction cache hardware. One type of encoding is a compressed encoding, named so because it does not explicitly store NOPs within the wide instruction word. A compressed encoding enables high memory utilization but at the expense of variable-sized instructions and the complexities associated with fetching variable-sized instructions. This paper examines instruction fetch and instruction cache mechanisms for VLIW architectures that use compressed encodings. Relevant issues are investigated using the TINKER experimental testbed. A taxonomy for instruction caches for VLIW architectures that use a compressed encoding is introduced. Four cache organizations from different categories within the taxonomy are presented: the uncompressed cache, the banked cache, the rigid silo cache, and the flexible silo cache. The designs are evaluated using trace-driven simulations. The results indicate that the banked cache is the best performer in terms of storage area requirements/program performance, and the silo cache designs could be appropriate in designs where storage limitations are not an issue or the characteristics of the applications to be executed are well-known.

Similar resources

Method and apparatus for the selective scoreboarding of computation results

Statically scheduled machines do have a disadvantage when dealing with dynamic events, such as cache hit or miss detection. Early VLIW machines were designed without caches, to achieve predictability in memory access. However, such designs suffer in memory performance. To achieve high performance, VLIW architectures must have adequate support for using caches. A simple VLIW design might use an ...

Simple ASIC Complex ASIC RaPiD FPGA GARP DPGA SuperSpeculative RAW TRACE ( Multiscalar ) SMT VECTOR

Poor scalability of superscalar architectures with increasing instruction-level parallelism (ILP) has resulted in a trend towards statically scheduled horizontal architectures such as Very Long Instruction Word (VLIW) processors and their more sophisticated successors called Explicitly Parallel Instruction Computing (EPIC) architectures. We extend the EPIC model with additional capabilities to...

Aligned Scheduling: Cache-Efficient Instruction Scheduling for VLIW Processors

The performance of statically scheduled VLIW processors is highly sensitive to the instruction scheduling performed by the compiler. In this work we identify a major deficiency in existing instruction scheduling for VLIW processors. Unlike most dynamically scheduled processors, a VLIW processor with no load-use hardware interlocks will completely stall upon a cache-miss of any of the operations...

Thesis - Vasileios Porpodas

Very Long Instruction Word (VLIW) processors are wide-issue statically scheduled processors. Instruction scheduling for these processors is performed by the compiler and is therefore a critical factor for its operation. Some VLIWs are clustered, a design that improves scalability to higher issue widths while improving energy efficiency and frequency. Their design is based on physically partitio...

Evaluating Compiler Support for Complexity Effective Network Processing

Statically scheduled processors are known to enable low complexity hardware implementations that lead to reduced design and verification time. However, statically scheduled processors are critically dependent on the compiler to exploit instruction level parallelism and deliver higher performance. In order to ascertain the suitability of statically scheduled processors for network processing (wh...

Publication date: 2007